Applications and Variations of the Maximum Common Subgraph for the Determination of Chemical Similarity
The Maximum Common Substructure (MCS), along with numerous graph theory techniques, has
been used widely in chemoinformatics. A topic which has been studied at Sheffield is the hyperstructure
concept, a chemical definition of a superstructure that represents the graph-theoretic union
of several molecules. This technique, however, has been poorly studied in the context of similarity-based
virtual screening. Most hyperstructure literature to date has focused on either construction
methodology or property prediction on small datasets of compounds.
The work in this thesis is divided into two parts. The first part describes a method for constructing
hyperstructures, and then describes the application of a hyperstructure in similarity searching in
large compound datasets, comparing it with extended connectivity fingerprint and MCS similarity.
Because hyperstructures performed significantly worse than fingerprints, additional work is described
on various weighting schemes for hyperstructures.
Due to the poor performance of hyperstructure and MCS screening compared to fingerprints, the question
arose whether the choice of MCS algorithm and MCS type had an influence.
A series of MCS algorithms and types were compared for speed, MCS size, and virtual screening
ability. A topologically-constrained variant of the MCS was found to be competitive with fingerprints,
and fusion of the two techniques improved active compound recall overall.
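
As an illustration of how an MCS-based similarity score and a fingerprint score might be combined, the following is a minimal sketch and not the method used in the thesis. It assumes a Tanimoto-style coefficient computed from atom counts and a simple sum-of-ranks fusion rule; the abstract does not state which coefficient or fusion rule was used. The function names, compound identifiers, and scores are all hypothetical.

```python
def mcs_tanimoto(n_a, n_b, n_mcs):
    """Tanimoto-style similarity from the sizes of two molecules and their MCS.

    Sizes are atom (or bond) counts. The choice of coefficient is an
    assumption made for this sketch.
    """
    if n_mcs > min(n_a, n_b):
        raise ValueError("MCS cannot be larger than either molecule")
    denom = n_a + n_b - n_mcs
    return n_mcs / denom if denom else 0.0


def ranks(scores):
    """Map each compound id to its rank (1 = most similar); ties broken by id."""
    ordered = sorted(scores, key=lambda cid: (-scores[cid], cid))
    return {cid: i + 1 for i, cid in enumerate(ordered)}


def sum_rank_fusion(*score_tables):
    """Fuse several similarity rankings by summing ranks (lower sum is better).

    Sum-of-ranks is one common fusion rule, assumed here for illustration.
    Only compounds present in every table are returned.
    """
    rank_tables = [ranks(table) for table in score_tables]
    common = set.intersection(*(set(table) for table in score_tables))
    fused = {cid: sum(r[cid] for r in rank_tables) for cid in common}
    return sorted(fused, key=lambda cid: (fused[cid], cid))


# Hypothetical scores for three database compounds against one query
# molecule of 20 atoms.
fingerprint_scores = {"cpd1": 0.82, "cpd2": 0.40, "cpd3": 0.65}
mcs_scores = {
    "cpd1": mcs_tanimoto(20, 22, 14),
    "cpd2": mcs_tanimoto(20, 18, 16),
    "cpd3": mcs_tanimoto(20, 25, 10),
}
fused_order = sum_rank_fusion(fingerprint_scores, mcs_scores)
```

Rank-based fusion avoids having to put the two score types on a common scale, which is one reason it is a convenient choice for a sketch like this.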